# Building Large Language Models

Tags: AI, LLM, programming

## Examples

TI-84 GPT4All and YouTube

How to add custom GPTs to any website in minutes.


## Libraries

Run a variety of LLMs locally using Ollama (a minimal usage sketch follows the links below)

Ollama main site

Ollama models supported
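
As a minimal sketch of calling a locally running model, assuming the Ollama server is on its default port (11434) and a model has already been pulled; the model name llama3 is an assumption, substitute whatever you have:

```python
# Query a local Ollama server over its REST API (default: localhost:11434).
# Assumes `ollama pull llama3` (or another model) has been run first.
import json
import urllib.request

payload = json.dumps({
    "model": "llama3",     # assumption: substitute any locally pulled model
    "prompt": "Why is the sky blue?",
    "stream": False,       # one JSON object instead of a token stream
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```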

## Collecting Data

Scripts to convert Libgen to txt (see also Explaining LLMs)

## Technical Details

From my question to Metaphor.systems:

https://jalammar.github.io/illustrated-transformer/

https://huggingface.co/
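
The Illustrated Transformer builds up to scaled dot-product attention; its core formula (standard notation, a fact about the Transformer rather than a quote from the post) is

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices and $d_k$ is the key dimension.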

Google’s free BERT model, a small model for language understanding
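
A minimal sketch of loading it, assuming the Hugging Face transformers package and the bert-base-uncased checkpoint (PyTorch required):

```python
# Load Google's small BERT checkpoint via Hugging Face transformers
# and get contextual embeddings for one sentence.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("LLMs compress the web into weights.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, 768 hidden dims)
```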

OpenAI Cookbook, a GitHub repo of examples of using the OpenAI API.
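
In the style of the cookbook's examples, a minimal chat-completion call with the official openai Python package (v1+); the model name here is an assumption:

```python
# Minimal chat-completion sketch; expects OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any chat-capable model works here
    messages=[{"role": "user", "content": "Explain attention in one sentence."}],
)
print(response.choices[0].message.content)
```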

GPT in 60 lines of NumPy (via HN)
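
In the same spirit, here is a minimal sketch of causal scaled dot-product attention, the core of GPT, in plain NumPy. This is my own sketch, not the post's actual code:

```python
import numpy as np

def softmax(x):
    # Subtract the row max for numerical stability.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_*: (d_model, d_head) projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])            # scaled dot products
    mask = np.triu(np.ones_like(scores), k=1) * -1e9   # hide future tokens
    return softmax(scores + mask) @ v                  # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))                  # six tokens, 16-dim embeddings
w_q, w_k, w_v = (rng.normal(size=(16, 16)) for _ in range(3))
print(causal_self_attention(x, w_q, w_k, w_v).shape)  # (6, 16)
```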

## How to Build

2023 summary from Simon Willison: a good list of resources for how to build your own LLM.

The Mathematics of Training LLMs, with Quentin Anthony of EleutherAI

A deep dive into the viral Transformer Math 101 article and into high-performance distributed training for Transformer-based architectures.
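
The article's headline rule of thumb is that training compute is roughly $C \approx 6ND$ FLOPs for a model with $N$ parameters trained on $D$ tokens. A back-of-the-envelope sketch (the example numbers below are illustrative, not from the article):

```python
# Rule of thumb from Transformer Math 101: training compute
# C ≈ 6 * N * D FLOPs for N parameters and D training tokens.
def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

# Illustrative example (not from the article): a 7B-parameter model
# trained on 1T tokens.
print(f"{training_flops(7e9, 1e12):.2e} FLOPs")  # ~4.20e+22
```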

An Observation on Generalization: a one-hour talk by Ilya Sutskever, OpenAI's chief scientist. He has previously argued that compression may be all you need for intelligence. In this lecture, he builds on the idea of Kolmogorov complexity and how neural networks implicitly seek simplicity in the representations they learn. He brings a clarity of thought to the generalization of these novel systems that is rarely seen in the industry.

Brendan Bycroft wrote a well-done step-by-step visualization of how an LLM works

Welcome to the walkthrough of the GPT large language model! Here we’ll explore the model nano-gpt, with a mere 85,000 parameters.

Its goal is a simple one: take a sequence of six letters, such as C B A B B C, and sort them into alphabetical order, i.e. “ABBBCC”.
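
The task itself is easy to state in code; a toy generator for such (input, target) pairs (my sketch, not Bycroft's) could look like:

```python
import random

# Toy data generator for the nano-gpt sorting task: the input is a
# random sequence of six letters from {A, B, C}; the target is the
# same letters in alphabetical order.
def make_example(rng: random.Random) -> tuple[str, str]:
    seq = "".join(rng.choice("ABC") for _ in range(6))
    return seq, "".join(sorted(seq))

rng = random.Random(0)
print(make_example(rng))  # e.g. ('CBABBC', 'ABBBCC'); output depends on the seed
```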

Bycroft’s Visual Step-by-Step Description of an LLM